feat(trial): zero-friction URL-to-workspace onboarding MVP by simple-agent-manager[bot] · Pull Request #758 · raphaeltm/simple-agent-manager

simple-agent-manager · 2026-04-18T21:35:34Z

Summary

Implements the zero-friction URL-to-workspace onboarding MVP from idea 01KPGJQ853C44JEREXWEZS1GQ8. Anonymous visitors paste a public GitHub repo URL, watch a live discovery agent analyze it, and get pre-generated suggestion chips that lead into a full SAM workspace after a 2-click login.

Built as a single orchestrated PR in three waves (Wave 0 foundation, Wave 1 with 4 parallel tracks, Wave 2 integration) against the sam/trial-onboarding-mvp integration branch. Not to be merged to main — this is flagged for @raphaeltm manual review before merge and before production configuration is applied.

cc @raphaeltm — Configuration Checklist Before Merge

Staging (`sammy.party`) — zero manual steps required

The deploy pipeline provisions + flips everything automatically:

TRIAL_CLAIM_TOKEN_SECRET — auto-generated by Pulumi (infra/resources/secrets.ts), stored encrypted in the Pulumi R2-backed state, pushed as a Worker secret by configure-secrets.sh (commit 086f4ded)
trials:enabled=true in KV — written by the staging deploy workflow on every run (deploy-reusable.yml + commit b15ca27c removing an invalid --remote flag)
TRIAL_LLM_PROVIDER=workers-ai — already wired in wrangler.toml vars
TRIAL_MODEL=@cf/meta/llama-3.1-8b-instruct
TRIAL_MONTHLY_CAP=1500
sam_anonymous_trials sentinel user — seeded via migration 0043

→ Nothing to click on the staging environment. A fresh workflow_dispatch on deploy-staging.yml gives you a working trial surface.

Production (`simple-agent-manager.org`) — one manual step (the key)

Procure Anthropic API key budgeted for trials
wrangler secret put ANTHROPIC_API_KEY_TRIAL --env production (separate from platform key)
Set TRIAL_LLM_PROVIDER=anthropic, TRIAL_MODEL=claude-3-5-haiku-latest, TRIAL_AGENT_TYPE=claude-code in production vars
Set TRIAL_MONTHLY_CAP to your preferred prod cap (default 500)
Flip the kill switch when ready: pnpm --filter @simple-agent-manager/api exec wrangler kv key put "trials:enabled" "true" --binding KV --env production
Confirm sam_anonymous_trials sentinel user exists on prod D1
Confirm trial_counter KV namespace + TrialCounter DO bindings exist in prod wrangler env

Cookies

HMAC key for trial fingerprint cookies reuses TRIAL_CLAIM_TOKEN_SECRET (auto-provisioned on staging, manual on prod if desired).

Kill Switches

Set KV trials:enabled=false to instantly pause trial creation. /try cleanly falls back to "Trials are paused" — verified on staging.
TRIAL_MONTHLY_CAP=0 is also a hard stop.
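
The two stops above can be sketched as a single gate. This is a minimal illustration, assuming a KV-like reader plus the 30-second cache and fail-closed behaviour mentioned in the Wave 0 commit message. `KvLike`, `createKillSwitch`, and `canCreateTrial` are hypothetical names, not the repository's actual helpers.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the PR's code.
interface KvLike {
  get(key: string): Promise<string | null>;
}

const KILL_SWITCH_KEY = "trials:enabled";
const CACHE_TTL_MS = 30_000; // 30s cache, per the Wave 0 commit message

function createKillSwitch(kv: KvLike, now: () => number = Date.now) {
  let cached: { value: boolean; expiresAt: number } | null = null;

  return async function isEnabled(): Promise<boolean> {
    const t = now();
    if (cached !== null && t < cached.expiresAt) return cached.value;

    let value = false;
    try {
      // Only the literal string "true" enables trials.
      value = (await kv.get(KILL_SWITCH_KEY)) === "true";
    } catch {
      // Fail closed: a KV error is treated as "trials paused".
      value = false;
    }
    cached = { value, expiresAt: t + CACHE_TTL_MS };
    return value;
  };
}

// A cap of 0 is a hard stop regardless of the KV flag.
function canCreateTrial(enabled: boolean, monthlyCap: number, used: number): boolean {
  return enabled && monthlyCap > 0 && used < monthlyCap;
}
```

Note that a flag flip can take up to one cache TTL to be observed in this sketch; whether the real helper also caches failed reads is not stated in the PR.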

What Shipped

Wave 0 — Foundation (`e253c08e`)

Shared Valibot schemas (packages/shared/src/trial.ts) for requests, responses, SSE events, idea shape
D1 migration 0043: trial_projects, trial_waitlist, sam_anonymous_trials sentinel user
Durable Objects: TrialCounter (monthly cap), TrialEventBus (SSE fan-out)
HMAC-signed cookie helpers (apps/api/src/services/trial/cookies.ts) for fingerprint (7d) and claim (48h) tokens
Kill-switch + cap helpers, discovery prompt template, route stubs
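
As an illustration of the HMAC-signed cookie helpers listed above, here is a minimal sketch of signing and verifying an expiring token with a constant-time comparison. The token layout (`<payload>.<expiryMs>.<signature>`) and the names `signToken` / `verifyToken` are assumptions for this example, and it uses Node's `node:crypto` for brevity; the actual Worker code in `cookies.ts` may use Web Crypto and a different format.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed TTLs taken from the PR description: fingerprint 7d, claim 48h.
const FINGERPRINT_TTL_MS = 7 * 24 * 60 * 60 * 1000;
const CLAIM_TTL_MS = 48 * 60 * 60 * 1000;

function sign(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

// Hypothetical layout: "<payload>.<expiryMs>.<signature>".
function signToken(payload: string, ttlMs: number, secret: string, now: number = Date.now()): string {
  const body = `${payload}.${now + ttlMs}`;
  return `${body}.${sign(body, secret)}`;
}

// Returns the payload when the signature is valid and unexpired, else null.
function verifyToken(token: string, secret: string, now: number = Date.now()): string | null {
  const sigDot = token.lastIndexOf(".");
  if (sigDot <= 0) return null;
  const body = token.slice(0, sigDot);
  const given = Buffer.from(token.slice(sigDot + 1));
  const expected = Buffer.from(sign(body, secret));
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;

  const expDot = body.lastIndexOf(".");
  if (expDot <= 0) return null;
  const expiresAt = Number(body.slice(expDot + 1));
  if (!Number.isFinite(expiresAt) || now >= expiresAt) return null;
  return body.slice(0, expDot);
}
```

The signature is checked before the expiry is parsed, so an attacker cannot probe the parser with unsigned input.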

Wave 1 Track A — Backend Lifecycle (`4ca29ea6`)

POST /api/trial/create — validates repo URL, checks kill switch + cap, creates project under sentinel user, starts discovery session
GET /api/trial/status — enabled + remaining slots + reset date (public, no auth)
POST /api/trial/waitlist — cap-exceeded email capture
Cron: month-rollover counter reset + 30d waitlist purge
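
The cap check, rollback, and month rollover in this track can be pictured with a small in-memory counter. This is only a sketch: `MonthlyCounter` is a hypothetical stand-in for the TrialCounter Durable Object, and it ignores persistence and concurrency, which the real DO handles.

```typescript
// Hypothetical in-memory stand-in for the TrialCounter Durable Object.
class MonthlyCounter {
  private counts = new Map<string, number>();
  private readonly cap: number;

  constructor(cap: number) {
    this.cap = cap;
  }

  // Buckets are keyed by UTC month, e.g. "2026-04".
  static monthKey(date: Date): string {
    const month = String(date.getUTCMonth() + 1).padStart(2, "0");
    return `${date.getUTCFullYear()}-${month}`;
  }

  // Allocates one slot if the month's cap has not been reached.
  tryIncrement(date: Date): { allowed: boolean; remaining: number } {
    const key = MonthlyCounter.monthKey(date);
    const used = this.counts.get(key) ?? 0;
    if (used >= this.cap) return { allowed: false, remaining: 0 };
    this.counts.set(key, used + 1);
    return { allowed: true, remaining: this.cap - used - 1 };
  }

  // Releases a slot, e.g. when a later write fails and the create is rolled back.
  decrement(date: Date): void {
    const key = MonthlyCounter.monthKey(date);
    const used = this.counts.get(key) ?? 0;
    this.counts.set(key, Math.max(0, used - 1));
  }

  // Month rollover: drop every bucket except the current month.
  prune(current: Date): void {
    const keep = MonthlyCounter.monthKey(current);
    for (const key of Array.from(this.counts.keys())) {
      if (key !== keep) this.counts.delete(key);
    }
  }
}
```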

Wave 1 Track B — Backend Claim + SSE (`6ba2e101`)

GET /api/trial/:trialId/events — SSE stream multiplexed from TrialEventBus DO
POST /api/trial/:trialId/claim — post-OAuth handler that transfers the anonymous project from sentinel user to the newly-signed-in user, validates claim cookie
OAuth callback integration (claim=<trialId> query param round-trip)
Agent wiring: discovery session uses TRIAL_LLM_PROVIDER + TRIAL_MODEL
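
The ownership transfer in the claim handler can be sketched as a compare-and-set on the project owner. The example below is an in-memory illustration only: `claimProject`, `ProjectRow`, and the sentinel id value are assumptions, and the real handler performs this as a single D1 `UPDATE` with a sentinel-owner precondition (per the Track B commit message).

```typescript
interface ProjectRow {
  id: string;
  userId: string;
  claimedAt: number | null;
}

// Assumed id value for the sentinel user; the real constant may differ.
const TRIAL_SENTINEL_USER_ID = "sam_anonymous_trials";

type ClaimResult =
  | { status: 200; projectId: string; claimedAt: number }
  | { status: 404 }
  | { status: 409 };

function claimProject(
  projects: Map<string, ProjectRow>,
  projectId: string,
  newUserId: string,
  now: number = Date.now(),
): ClaimResult {
  const row = projects.get(projectId);
  if (row === undefined) return { status: 404 };
  // Equivalent of `UPDATE ... WHERE id = ? AND user_id = <sentinel>`:
  // if the owner is no longer the sentinel, zero rows change and we report 409.
  if (row.userId !== TRIAL_SENTINEL_USER_ID) return { status: 409 };
  row.userId = newUserId;
  row.claimedAt = now;
  return { status: 200, projectId, claimedAt: now };
}
```

Putting the precondition in the write itself means two racing claims cannot both succeed; the loser sees the 409 branch.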

Wave 1 Track C — Frontend Discovery (`e8088705`)

/try landing page (mobile-first, repo URL input, kill-switch + cap-exceeded fallbacks)
/try/:trialId discovery feed consuming the SSE event stream
/try/cap-exceeded + /try/waitlist/thanks pages
React Router entries wired into App.tsx

Wave 1 Track D — Frontend Chat Gate (`1114c8fc`)

ChatGate component: suggestion chip carousel + textarea + send button
LoginSheet modal triggering GitHub OAuth with claim cookie preserved
useTrialDraft hook: localStorage persistence of the draft across the OAuth round-trip
useTrialClaim hook: post-login auto-submit of the stashed draft to the claimed project's chat
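
The post-login flow in `useTrialClaim` is described in the Track D commit message as an idle → claiming → submitting → done/error state machine. A minimal reducer sketch of that idea follows; the state and event shapes are assumptions for illustration, not the hook's real types.

```typescript
type ClaimState =
  | { phase: "idle" }
  | { phase: "claiming" }
  | { phase: "submitting"; projectId: string }
  | { phase: "done"; projectId: string }
  | { phase: "error"; message: string; projectId?: string };

type ClaimEvent =
  | { type: "start" }
  | { type: "claimed"; projectId: string }
  | { type: "submitted" }
  | { type: "failed"; message: string };

// Pure transition function; events that do not apply to the current phase are ignored.
function reduceClaim(state: ClaimState, event: ClaimEvent): ClaimState {
  switch (event.type) {
    case "start":
      return state.phase === "idle" ? { phase: "claiming" } : state;
    case "claimed":
      return state.phase === "claiming"
        ? { phase: "submitting", projectId: event.projectId }
        : state;
    case "submitted":
      return state.phase === "submitting"
        ? { phase: "done", projectId: state.projectId }
        : state;
    case "failed":
      if (state.phase === "claiming") return { phase: "error", message: event.message };
      // Keep the projectId when the submit fails so the UI can retry the send.
      if (state.phase === "submitting") {
        return { phase: "error", message: event.message, projectId: state.projectId };
      }
      return state;
  }
}
```

Ignoring a second `start` while a claim is in flight is one simple way to get the "single claim per mount" behaviour under React StrictMode.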

Wave 2 — Integration, Automation, and Live Fix

Merged all 4 Wave 1 tracks into sam/trial-onboarding-mvp. Two conflicts resolved:
- apps/api/src/env.ts — kept both Track A + Track B TRIAL_* env vars.
- apps/web/src/components/trial/ChatGate.tsx — kept Track D's real implementation; adapted Track C's TryDiscovery to Track D's TrialIdea contract + onAuthenticatedSubmit callback.
Automated the staging trial secret (commit 086f4ded): added infra/resources/secrets.ts entry that auto-generates TRIAL_CLAIM_TOKEN_SECRET via @pulumi/random, and wired configure-secrets.sh to push it as a Worker secret. No manual wrangler secret put on staging ever.
Automated the staging kill-switch (commits 086f4ded + b15ca27c): added a conditional step to .github/workflows/deploy-reusable.yml that writes trials:enabled=true to KV on every staging deploy (and only staging). Initial attempt used --remote, which is not a valid flag for wrangler kv key put — removed in b15ca27c.
Discovered and fixed a Wave 1 integration bug (commit db1d6332): Track A was persisting new trials to D1 only, while Track B readers (events.ts, claim.ts, trial-runner.ts) look up trials in KV via readTrial(). Every SSE connection 404'd with "Trial not found". Fix mirrors the trial to KV in POST /api/trial/create after the D1 insert, before issuing cookies, with rollback on KV failure (D1 row deleted, TrialCounter slot released). writeTrial() also hardened to skip the trial-by-project: index when projectId is empty (would otherwise collide all pending trials on a single key). Added regression test asserting KV.put("trial:<id>", ...) is invoked on the happy path.
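
To make the integration fix concrete, the sketch below mirrors the two behaviours described: skipping the by-project index while `projectId` is empty, and rolling back when the KV write fails. It uses a `Map` in place of KV and injected callbacks in place of D1 and the TrialCounter; `createTrial` and `CreateDeps` are hypothetical names, and only the key prefixes come from the PR text.

```typescript
interface TrialRecord {
  trialId: string;
  projectId: string; // empty until the project row exists
  fingerprint: string;
}

type Kv = Map<string, string>;

function writeTrial(kv: Kv, trial: TrialRecord): void {
  kv.set(`trial:${trial.trialId}`, JSON.stringify(trial));
  kv.set(`trial-by-fingerprint:${trial.fingerprint}`, trial.trialId);
  // Skip the index for pending trials; otherwise they would all share
  // the single key "trial-by-project:".
  if (trial.projectId !== "") {
    kv.set(`trial-by-project:${trial.projectId}`, trial.trialId);
  }
}

interface CreateDeps {
  insertRow(trial: TrialRecord): void;
  deleteRow(trialId: string): void;
  releaseSlot(): void;
  writeKv(trial: TrialRecord): void;
}

// Returns true on success; on KV failure undoes the D1 insert and frees the slot.
function createTrial(deps: CreateDeps, trial: TrialRecord): boolean {
  deps.insertRow(trial);
  try {
    deps.writeKv(trial);
    return true;
  } catch {
    deps.deleteRow(trial.trialId);
    deps.releaseSlot();
    return false;
  }
}
```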

Non-negotiable Constraints Verified

Mobile-first (375×667 authoritative) — all four trial screens rendered and screenshot-verified at mobile width
Public GitHub repos only — GITHUB_REPO_URL_REGEX in shared schemas
Locked initial prompt — discovery prompt template owned by the backend; user cannot write the first message
Login gate on chat interactions — ChatGate triggers LoginSheet on any send attempt by an anonymous visitor
Monthly cap + kill switch — TrialCounter DO + KV trials:enabled flag
Staging uses opencode + Workers AI; production will use claude-code + Anthropic
Valibot for runtime validation — every request schema in packages/shared/src/trial.ts
System user pattern — no schema change to projects.userId; anonymous projects owned by sam_anonymous_trials until claimed
HMAC-signed claim cookie — uses auto-provisioned TRIAL_CLAIM_TOKEN_SECRET
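
For the "public GitHub repos only" constraint, a URL-shape check might look like the sketch below. The pattern is an assumption written for this example and is not the actual `GITHUB_REPO_URL_REGEX` from the shared schemas. A shape check like this cannot tell public from private; the PR's create route does that with a separate GitHub repo probe.

```typescript
// Hypothetical pattern: https only, github.com only, exactly owner/name.
const REPO_URL_PATTERN =
  /^https:\/\/github\.com\/([A-Za-z0-9][A-Za-z0-9-]{0,38})\/([A-Za-z0-9._-]+?)(?:\.git)?\/?$/;

function parseRepoUrl(input: string): { owner: string; name: string } | null {
  const match = REPO_URL_PATTERN.exec(input.trim());
  if (match === null) return null;
  const owner = match[1];
  const name = match[2];
  if (owner === undefined || name === undefined) return null;
  return { owner, name };
}
```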

Local Quality Gates

pnpm typecheck — clean across all packages
pnpm lint — 0 errors
API unit tests — 3773 / 3773 passing (includes new writeTrial regression test)
Web unit tests — 1863 / 1863 passing

Staging Deployments

| Run | Commit | Result |
| --- | --- | --- |
| 24614206706 | `c2780059` | ✅ initial merge deploy |
| 24614985380 | pre-`b15ca27c` | ❌ `Unknown argument: remote` — fixed by removing `--remote` flag |
| 24615223155 | post-`db1d6332` | ✅ final green with kill-switch KV put + all fixes |

Staging Verification (Playwright + curl, live app)

With trials enabled on staging (KV trials:enabled=true), the end-to-end happy path was exercised:

| Check | Result |
| --- | --- |
| `GET /api/trial/status` | `{"enabled":true,"remaining":1500,"resetsAt":"2026-05-01"}` ✅ |
| `POST /api/trial/create` with `https://github.com/sindresorhus/is` | `201` with `Set-Cookie: sam_trial_fingerprint=…` + `sam_trial_claim=…` ✅ |
| `GET /api/trial/:trialId/events` via real cookies | `HTTP/2 200`, `content-type: text/event-stream`, `: connected` heartbeat ✅ |
| `/try` landing form submission on mobile 375×667 | navigates to `/try/:trialId`, ChatGate renders "Live" status, feed waits for events, zero console errors ✅ |
| Same on desktop 1280×800 | ✅ |

Screenshots: trial-sse-live-mobile.png, trial-sse-live-desktop.png (in .codex/tmp/playwright-screenshots/).

Regression spot-check

Authenticated via smoke-test token login → /dashboard renders, project list loads, 0 console errors
Navigation sidebar, command palette, notifications panel all intact
/health → 200 healthy

What was NOT verified end-to-end

The OAuth claim + post-login auto-submit leg (chat gate → login sheet → GitHub OAuth → /api/trial/:trialId/claim → stashed draft replay) requires a real GitHub OAuth round-trip with a human. All individual components have unit + integration coverage; the OAuth leg is gated behind a real sign-in and deferred to Raphaël's manual review.

Review Status

Full specialist review was not dispatched because this PR is flagged for manual review by @raphaeltm before merge. The needs-human-review label is applied. Raphaël will decide whether to dispatch additional reviewers, flip production config, and proceed to merge.

Do NOT Merge Yet

❌ Do NOT merge to main until Raphaël has reviewed the configuration checklist.
❌ Do NOT deploy to production until the Anthropic key is procured and the OAuth claim leg has been exercised at least once.

🤖 Generated with Claude Code

Lays groundwork for /try — shared types (Valibot), DB migration 0043 (system user sentinel + trial_waitlist table), wrangler TRIAL_COUNTER DO binding (v7 migration) + trial env vars, trial services (HMAC-signed cookies with constant-time compare, KV kill-switch with 30s cache + fail-closed, discovery prompt), 501 route stubs under /api/trial/*, TrialCounter DO with atomic transactionSync increment/decrement, frontend Try/TryDiscovery stubs mounted at /try + /try/:trialId, operator docs at docs/guides/trial-configuration.md, and 43 unit tests covering cookie round-trip/tamper/expiry, kill-switch cache/TTL/fail-closed, and TrialCounter cap enforcement. Trials remain disabled by default (kill-switch fails closed) so this is safe to deploy without setting TRIAL_CLAIM_TOKEN_SECRET. Wave 1 will wire the live create/events/claim/waitlist handlers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implements backend lifecycle for zero-friction trial onboarding (Wave 1 Track A):

- trials table + sentinel-installation workaround (migration 0044)
- TrialCounter DO: fetch surface + tryIncrement/prune RPC methods
- POST /api/trial/create with Valibot validation, kill-switch gate, GitHub repo probe (size/privacy), DO slot allocation, and counter-decrement rollback on D1 failure
- GET /api/trial/status with fail-closed fallback when DO throws
- POST /api/trial/waitlist with lowercase-email dedupe via onConflictDoNothing(email, resetDate)
- Three scheduled modules wired into cron dispatch:
  - trial-expire: 5-min sweep marks expired trials
  - trial-rollover: monthly DO pruning (0 3 1 * *)
  - trial-waitlist-cleanup: daily notified-row purge (0 4 * * *)
- All configurable via DEFAULT_* constants + env overrides (Principle XI)
- 92 new behavioral tests covering resolution branches, DO RPC surface, fallback semantics, cookie issuance, and fail-closed error paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Builds the frontend components that gate the trial experience behind GitHub auth — a chat input with suggestion chips for anonymous users, and a login sheet that opens when they send their first message. Integration into TryDiscovery (SSE streaming `trial.idea` events) lands in wave-2 alongside the live /claim handler.

Components

- ChatGate: autogrowing textarea + horizontally-scrolling chip row; Cmd/Ctrl+Enter submits, Enter inserts newline; disabled state when empty/whitespace; surfaces submit errors without clearing the draft
- LoginSheet: responsive dialog (mobile bottom-sheet, desktop centered modal) with Escape/backdrop/close-button dismissal, focus trap between primary CTA + close, body scroll lock, return-to URL construction (trialId URL-encoded, ?claim=1 sentinel)
- SuggestionChip: 44px-tall touch target with title + optional summary, aria-label compose, disabled state

Hooks

- useTrialDraft: per-trialId localStorage draft with 400ms debounce (flush-on-unmount), synchronous writes when debounceMs=0, rehydrates on trialId change, no-ops with undefined trialId
- useTrialClaim: idle → claiming → submitting → done/error state machine; injectable claim/submit fns for testing; StrictMode-safe (single claim per mount); clears draft only on successful submit; preserves projectId when submit fails so UI can retry

Harness + tests

- TrialChatGateHarness at /__test/trial-chat-gate (public, not linked from nav) renders ChatGate + LoginSheet with query-param-driven mock data (ideas=0..20, long=1, auth=1, loginOpen=1) so Playwright can capture screenshots without hitting the real claim flow
- 43 new unit tests across components + hooks covering rendering, interactions, persistence, error states, focus management
- 13 Playwright visual scenarios at 375x667 + 1280x800: empty state, 1/5/20 chips (page-level overflow asserted false — chip row owns its horizontal scroll), long-text wrapping, anonymous send opening LoginSheet, bottom-sheet vs centered-modal layouts, 44px touch targets on send button + suggestion chips

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Wire trial onboarding backend so the post-OAuth claim flow and the per-trial event stream work end-to-end.

- TrialEventBus DO: in-memory ring buffer (MAX_BUFFERED_EVENTS=500) with long-poll /poll, /append with terminal-event auto-close, /close, waiter-wake semantics. Configurable via TRIAL_EVENT_BUS_DEFAULT_POLL_TIMEOUT_MS.
- trial-store service: KV-backed writeTrial/readTrial/markTrialClaimed with 3-key indexing (by trialId, by projectId, by fingerprint).
- trial-runner: mode-aware config resolution (staging=opencode+workers-ai, production=claude-code+anthropic); production requires ANTHROPIC_API_KEY_TRIAL. startDiscoveryAgent creates chat + ACP session with discovery prompt. emitTrialEvent/emitTrialEventForProject append to TrialEventBus best-effort.
- GET /api/trial/:trialId/events: fingerprint-cookie-authenticated SSE. Verifies trial record + HMAC signature + UUID match (fails closed on any mismatch). Heartbeat every TRIAL_SSE_HEARTBEAT_MS (default 15s); long-poll DO every TRIAL_SSE_POLL_TIMEOUT_MS; max duration TRIAL_SSE_MAX_DURATION_MS. Closes on terminal event.
- POST /api/trial/claim: auth-required; verifies HMAC claim cookie; atomic D1 UPDATE with WHERE userId=TRIAL_SENTINEL_USER_ID precondition; clears claim cookie; returns {projectId, claimedAt}. Returns 409 on UPDATE-changes=0 race.
- OAuth callback hook (maybeAttachTrialClaimCookie): on 2xx/3xx response from /callback/github, if a valid fingerprint cookie maps to an unclaimed non-expired trial, sign a claim token, set sam_trial_claim cookie, and rewrite Location to https://app.${BASE_DOMAIN}/try/:trialId?claim=1.
- Env + wrangler binding for TRIAL_EVENT_BUS Durable Object.

70 new unit tests (6 files) cover DO long-poll/waiter-wake/terminal-close, SSE auth-failure matrix + happy path, claim route 400/404/409/200 branches, oauth-hook bail-out matrix + rewrite happy path, trial-runner config resolution + error paths, and trial-store round-trips.
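
The TrialEventBus buffering described in this commit can be sketched as a bounded buffer with cursors and terminal-event auto-close. This is a simplified, synchronous illustration: `EventRingBuffer` and its method shapes are assumptions, and the long-poll waiter-wake logic of the real Durable Object is left out.

```typescript
interface BufferedEvent<T> {
  cursor: number;
  event: T;
}

class EventRingBuffer<T> {
  private events: BufferedEvent<T>[] = [];
  private nextCursor = 0;
  private closed = false;
  private readonly maxEvents: number;

  // 500 mirrors MAX_BUFFERED_EVENTS from the commit message.
  constructor(maxEvents: number = 500) {
    this.maxEvents = maxEvents;
  }

  // Returns the assigned cursor, or null once the stream has been closed.
  append(event: T, terminal: boolean = false): number | null {
    if (this.closed) return null;
    const cursor = this.nextCursor++;
    this.events.push({ cursor, event });
    // Drop the oldest entry once the buffer is full.
    if (this.events.length > this.maxEvents) this.events.shift();
    if (terminal) this.closed = true;
    return cursor;
  }

  // Returns every buffered event newer than `afterCursor` (use -1 for all).
  poll(afterCursor: number): { events: BufferedEvent<T>[]; closed: boolean } {
    return {
      events: this.events.filter((e) => e.cursor > afterCursor),
      closed: this.closed,
    };
  }
}
```

A reader that remembers the last cursor it saw can reconnect and resume without replaying events, as long as it has not fallen more than `maxEvents` behind.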

Replaces Wave 0 stubs with full trial discovery flow:

- Try landing page with GitHub URL validation + error branches (invalid_url, repo_private, trials_disabled, cap_exceeded, existing_trial)
- TryDiscovery streams SSE events (started, progress, knowledge, idea, ready) with exponential backoff reconnect (max 5 retries) and renders repo header, progress, knowledge graph, ideas, and workspace-ready CTA
- TryCapExceeded page with waitlist email capture + inline validation
- TryWaitlistThanks confirmation page
- trial-api client: createTrial, joinWaitlist, openTrialEventStream
- ChatGate stub placeholder for Track D integration

Tests:

- Vitest component tests for Try + TryCapExceeded (11 cases: URL validation, success nav, existing-trial resume, each error branch, email validation, waitlist submit, API error)
- Playwright visual audit at 375x667 and 1280x800 covering landing, discovery (streaming/ready/empty), cap-exceeded, waitlist-thanks, and all inline error states — overflow asserted on every test

Mobile-first with design tokens; 56px primary CTA, 44px secondary touch targets; env(safe-area-inset-*) padding.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
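
The "exponential backoff reconnect (max 5 retries)" behaviour can be expressed as a small delay function. Only the retry limit comes from the commit message; the base delay, the ceiling, and the name `reconnectDelayMs` are assumptions for this sketch.

```typescript
const MAX_RETRIES = 5; // from the commit message

// Returns the wait before reconnect attempt `attempt` (0-based),
// or null once the retry budget is exhausted.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 1_000, // assumed base delay
  maxMs: number = 30_000, // assumed ceiling
): number | null {
  if (attempt < 0 || attempt >= MAX_RETRIES) return null;
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

With these assumed defaults the waits are 1s, 2s, 4s, 8s, and 16s, after which the caller would stop reconnecting and show an error state.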

… integration Resolves conflict in ChatGate.tsx by keeping Track D's real implementation; adapts TryDiscovery to Track D's ChatGate contract (TrialIdea shape, onAuthenticatedSubmit handler that navigates to the claimed project chat with the message staged in sessionStorage).

… kill-switch

Previously, self-hosters had to manually run `wrangler secret put TRIAL_CLAIM_TOKEN_SECRET` and `wrangler kv key put trials:enabled true` before the /try flow would work on staging. Wire both into the standard deployment pipeline so staging trials are live out of the box.

Changes:

- infra/resources/secrets.ts: add `trial-claim-token-secret` RandomId resource (32 bytes base64) + export `trialClaimTokenSecret` Pulumi output, same persistence pattern as encryptionKey / jwtPrivateKey.
- infra/index.ts: re-export the new output.
- scripts/deploy/configure-secrets.sh: read trialClaimTokenSecret from Pulumi state and set it as a required Worker secret on every deploy.
- .github/workflows/deploy-reusable.yml: add a staging-only step that sets KV `trials:enabled=true` via wrangler after the worker deploys. Production stays opt-in per spec (operator flips the flag manually when ready to accept live trial traffic).
- docs/guides/trial-configuration.md: document the automation — no more manual secret-put or kv-put steps for staging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

`wrangler kv key put` writes to remote by default; --remote is not a valid flag for that subcommand and caused the staging deploy's trial kill-switch step to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…olve it

Track A (create.ts) inserted trial records into D1 only; Track B readers (events.ts, claim.ts, trial-runner.ts) all look trials up via trial-store.readTrial() which reads from KV. The result: every SSE connection 404'd with "Trial not found or expired" seconds after the trial was created.

Integration fix:

- create.ts calls writeTrial() after the D1 insert, with projectId='' (Track B's orchestrator rewrites the KV record once the project row exists). On KV failure, roll back the D1 row and release the TrialCounter slot so we don't burn a cap entry.
- writeTrial() skips the trial-by-project index when projectId is empty, preventing all pending trials from colliding on `trial-by-project:`.
- events.ts: use errors.notFound('Trial') — previous argument produced doubled "Trial not found or expired not found".

Added a regression test asserting writeTrial is invoked from the happy path (captures the exact KV put) so this bug cannot silently recur.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simple-agent-manager · 2026-04-18T22:31:44Z

Staging verification update — trials automation + integration fix

Two follow-up commits landed after the initial PR:

1. Deploy automation (commits 086f4ded + b15ca27c)

TRIAL_CLAIM_TOKEN_SECRET is now auto-provisioned by Pulumi (infra/resources/secrets.ts) and pushed by configure-secrets.sh on every deploy — no manual wrangler secret put
trials:enabled KV flag is set automatically by deploy-reusable.yml on staging deploys — no manual wrangler kv key put
Production remains opt-in (operator flips the flag when ready)

2. Wave 1 integration bug fix (commit db1d6332)

Track A persisted trials to D1; Track B read from KV via trial-store.readTrial(). Nothing wrote to KV, so every SSE /events call returned 404 "Trial not found".
Fix: create.ts calls writeTrial() after the D1 insert with projectId='' (Track B's orchestrator rewrites the record once the project row exists). On KV failure, D1 row is rolled back and the TrialCounter slot released.
Hardened writeTrial() to skip the by-project index when projectId is empty, preventing pending-trial collisions.
Added regression test asserting writeTrial is invoked — this bug cannot silently recur.

Staging verification evidence (run 24615223155, 2026-04-18 22:22Z):

✅ /api/trial/status → {"enabled":true,"remaining":1498,"resetsAt":"2026-05-01"}
✅ POST /api/trial/create with public repo URL → 201 with set-cookie fingerprint + claim cookies, returns trialId
✅ GET /api/trial/:trialId/events with fingerprint cookie → HTTP/2 200 text/event-stream, : connected heartbeat received
✅ /try/:trialId page renders ChatGate in "Live" state (green), zero console errors, on mobile 375×667 and desktop 1280×800

Updated configuration checklist for @raphaeltm:

✅ ~~TRIAL_CLAIM_TOKEN_SECRET~~ — auto-provisioned by Pulumi, no action needed
✅ ~~Staging kill-switch~~ — auto-set by deploy workflow, no action needed
Production kill-switch — flip trials:enabled=true manually when ready: pnpm --filter @simple-agent-manager/api exec wrangler kv key put "trials:enabled" "true" --binding KV --env production
Production Anthropic key — set ANTHROPIC_API_KEY_TRIAL via wrangler secret put ... --env production once procured (required for production trials — staging uses Workers AI, no key needed)
Optional tunables in apps/api/wrangler.toml: TRIAL_MONTHLY_CAP (default 1500), TRIAL_WORKSPACE_TTL_MS (default 20 min), TRIAL_DATA_RETENTION_HOURS (default 168)
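
The "defaults with env overrides" pattern behind these tunables can be sketched as follows. The `DEFAULT_TRIAL_*` constant names, `readIntEnv`, and `resolveTrialTunables` are hypothetical; only the variable names and default values are taken from the checklist above.

```typescript
// Defaults as listed in the checklist; constant names are assumed.
const DEFAULT_TRIAL_MONTHLY_CAP = 1500;
const DEFAULT_TRIAL_WORKSPACE_TTL_MS = 20 * 60 * 1000; // 20 min
const DEFAULT_TRIAL_DATA_RETENTION_HOURS = 168;

type EnvVars = Record<string, string | undefined>;

// Parses a non-negative integer override, falling back on missing or invalid input.
function readIntEnv(env: EnvVars, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
}

function resolveTrialTunables(env: EnvVars) {
  return {
    monthlyCap: readIntEnv(env, "TRIAL_MONTHLY_CAP", DEFAULT_TRIAL_MONTHLY_CAP),
    workspaceTtlMs: readIntEnv(env, "TRIAL_WORKSPACE_TTL_MS", DEFAULT_TRIAL_WORKSPACE_TTL_MS),
    dataRetentionHours: readIntEnv(
      env,
      "TRIAL_DATA_RETENTION_HOURS",
      DEFAULT_TRIAL_DATA_RETENTION_HOURS,
    ),
  };
}
```

Accepting `0` as a valid override matters here, because `TRIAL_MONTHLY_CAP=0` is documented as a hard stop.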

Production deploy and merge remain deferred per your instructions.

…760) * task: move trial-orchestrator-wire-up to active Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(shared): add trial orchestrator timing/retry constants Introduce DEFAULT_TRIAL_ORCHESTRATOR_* and DEFAULT_TRIAL_KNOWLEDGE_* constants used by the alarm-driven TrialOrchestrator DO and the fast-path GitHub knowledge probes fired from POST /api/trial/create. Every value is env-var overridable (Constitution Principle XI). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(trial): add TrialOrchestrator DO binding, env vars, sentinel installation - Declare TRIAL_ORCHESTRATOR DO binding + v9 migration in wrangler.toml - Extend Env interface with TrialOrchestrator/Knowledge tuning knobs and TRIAL_ANONYMOUS_INSTALLATION_ID override - Migration 0045 seeds the system_anonymous_trials_installation sentinel row so anonymous trial projects can satisfy the NOT NULL + FK constraint on projects.installation_id without owning a real GitHub App install The DO class itself is added in the next commit. * feat(trial): add TrialOrchestrator DO state machine Adds the alarm-driven TrialOrchestrator Durable Object (one per trialId) that replaces the fire-and-forget `waitUntil(provisionTrial())` pattern with a resumable state machine. Module layout mirrors TaskRunner: - types.ts — TrialOrchestratorStep union + persisted state shape - helpers.ts — re-exports TaskRunner helpers; adds sentinel-user / sentinel-installation resolvers + safeEmitTrialEvent. - steps.ts — per-step handlers (project_creation, node_selection, node_provisioning, node_agent_ready, workspace_creation, workspace_ready, discovery_agent_start, running). - index.ts — DO class: start(), alarm() dispatch, backoff retry, overall-timeout guard, trial.error emission on failure. Each step emits `trial.progress` at entry so the SSE stream reflects where the orchestrator is. 
Terminal `running` step is idle — the ACP bridge (wired separately) is responsible for emitting `trial.ready` after the discovery agent produces its first assistant turn. All timing/retry knobs read from env vars with DEFAULT_* fallbacks (Constitution Principle XI). Adds two new optional env fields: TRIAL_VM_SIZE and TRIAL_VM_LOCATION for trial-specific VM overrides. Exports the class from apps/api/src/index.ts so the Workers runtime can instantiate it via the TRIAL_ORCHESTRATOR binding (already declared in wrangler.toml v9 migration). Task: tasks/active/2026-04-19-trial-orchestrator-wire-up.md * feat(trial): bridge ACP/MCP events into trial SSE stream Adds a dedicated `services/trial/bridge.ts` module with three helpers that hook into existing hot paths and fan qualifying events out as `trial.*` SSE events: - bridgeAcpSessionTransition: `running` → trial.ready (with workspaceUrl derived from BASE_DOMAIN + workspaceId), `failed` → trial.error. - bridgeKnowledgeAdded: fires trial.knowledge when the discovery agent adds a knowledge observation via MCP. - bridgeIdeaCreated: fires trial.idea with a summary-clipped excerpt when the discovery agent creates an idea via MCP. All three helpers short-circuit on non-trial projects after a single `readTrialByProject(env, projectId)` KV lookup, so normal (non-trial) project traffic only pays that one extra KV read on qualifying events. Hook sites: - ProjectData DO `transitionAcpSession` — dynamic-imports the bridge and dispatches after the transition succeeds, guarded by `if (projectId)` and wrapped in try/catch so bridge errors never block the transition. Casts `this.env` through unknown to the worker-scope Env because the DO's local Env type is intentionally narrow. - `handleAddKnowledge` MCP handler — dispatches after addKnowledgeObservation. - `handleCreateIdea` MCP handler — dispatches after the DB insert. 
Every dispatch is fire-and-forget; bridge errors are already caught inside each helper but the call sites add a second try/catch for defense. Task: tasks/active/2026-04-19-trial-orchestrator-wire-up.md Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(trial): wire TrialOrchestrator + GitHub knowledge into POST /api/trial/create Adds two fire-and-forget dispatches after the trial record is written and before the HTTP response returns, via c.executionCtx.waitUntil: 1. TrialOrchestrator DO `start()` — kicks off the alarm-driven state machine that provisions a project, workspace, and discovery agent session. The DO is idempotent on `start()`, so accidental re-invocations no-op. 2. emitGithubKnowledgeEvents() — hits unauthenticated GitHub REST endpoints (`/repos/:o/:n`, `/repos/:o/:n/languages`, `/repos/:o/:n/readme`) in parallel and emits up to `TRIAL_KNOWLEDGE_MAX_EVENTS` `trial.knowledge` events within ~`TRIAL_KNOWLEDGE_GITHUB_TIMEOUT_MS` each. Surfaces description, primary language, stars, topics, license, language breakdown, and README first paragraph so the SSE stream shows activity within ~3s while the VM provisions in the background. Both helpers fully swallow errors — an orchestrator dispatch failure or GitHub rate-limit hit never blocks the response or crashes the Worker. All knobs are env-configurable per Constitution Principle XI: - TRIAL_KNOWLEDGE_GITHUB_TIMEOUT_MS (default 5000) - TRIAL_KNOWLEDGE_MAX_EVENTS (default 10) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(trial): cover orchestrator dispatch, bridge, and GitHub knowledge probe Adds four categories of behavioral tests for the trial onboarding wiring: 1. trial-create.ts.test.ts (+2 cases) - Asserts TrialOrchestrator.start() is dispatched via waitUntil with trialId, repoOwner, repoName, and canonical repoUrl. - Asserts a rejecting start() does NOT propagate — the HTTP response still returns 201 (fire-and-forget contract). 
- Updates makeEnv() to stub TRIAL_ORCHESTRATOR + TRIAL_EVENT_BUS bindings and introduces makeExecutionCtx() helper. - Also adds a graceful-fallback in create.ts so routes that run without a Worker executionCtx (unit tests) still complete instead of 500-ing on Hono's "This context has no ExecutionContext" throw. 2. trial-github-knowledge.test.ts (new, 5 cases) - Happy path: verifies description, primary language, stars, topics, license, language breakdown, and README paragraph are all emitted. - TRIAL_KNOWLEDGE_MAX_EVENTS cap is enforced. - Total network failure → 0 events, no throw. - Non-2xx repo metadata response → 0 events, no throw. - emitTrialEvent rejection → no throw (last line of defense). 3. trial-orchestrator.test.ts (new, 4 cases) - start() persists initial state with currentStep='project_creation' and schedules an alarm. - start() is idempotent — second call with same input is a no-op and does not re-schedule the alarm. - alarm() on a completed state is a terminal no-op. - alarm() emits trial.error and marks completed when the overall timeout budget is exceeded. 4. trial-bridge.test.ts (new, 9 cases) - bridgeAcpSessionTransition: no-ops on non-trial projects, emits trial.ready on 'running' with ws-{id}.{BASE_DOMAIN} URL, emits trial.error on 'failed', no-ops on other transitions, swallows emitter errors. - bridgeKnowledgeAdded / bridgeIdeaCreated: no-op on non-trial, emit correct event shape when trial exists, swallow errors. All 3,793 tests pass; typecheck clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs(trial): document TrialOrchestrator + GitHub knowledge fast-path Adds an "Orchestrator and Fast-Path Knowledge" section to the trial configuration guide covering the two fire-and-forget background tasks dispatched from POST /api/trial/create (TrialOrchestrator DO and the GitHub REST knowledge probe) plus the ACP/MCP event bridge, with tunables tables for both. 
Also records the change in CLAUDE.md "Recent Changes" and marks the corresponding checklist items in the task file.

* style(trial): sort imports per eslint rules

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(trial): emit trial.started event from orchestrator start()

The SSE stream's first real event must be `trial.started` so the frontend can transition out of the "Warming up..." empty state. Without it, viewers sat on the placeholder until `trial.progress` or `trial.knowledge` arrived, which could be 3-5s later.

Added unit test asserting `emitTrialEvent` is called exactly once with type='trial.started' and the expected shape.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(trial): capability test chaining start() + alarm() through event bus

Addresses task-completion-validator HIGH finding #2: no capability test exercised the full orchestrator state machine through the event bus seam. Existing per-method tests covered each transition in isolation but did not chain them.

New test drives:
start() → persist + setAlarm + emit trial.started
→ (simulate expired budget)
→ alarm() → mark failed + emit trial.error

The `emitTrialEvent` mock is the event-bus seam; its downstream is already covered by tests/unit/routes/trial-events.test.ts which verifies the bus → SSE stream path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore(trial): archive orchestrator wire-up task

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(trial): cover alarm() retry/backoff + step handler invariants

Addresses test-engineer review HIGH findings #1 and #2 (partial).

Finding #1, alarm() retry/backoff: Added 4 tests driving the step-error catch branches via a `./steps` vi.mock. Covers transient-error + retries-remaining (increments counter and schedules backoff, no failTrial), permanent-error (immediate failTrial regardless of budget), transient-error with retries exhausted (promotes to failTrial), and the null-state guard (alarm fires before start()).

Finding #2, step handlers: New `trial-orchestrator-steps.test.ts` covers the two highest-value invariants that don't need D1/DO plumbing mocks:
- handleRunning marks state.completed = true
- handleDiscoveryAgentStart throws permanent on missing IDs
- handleDiscoveryAgentStart is idempotent when session already linked

Broader per-handler coverage (project_creation / node_selection / node_provisioning / node_agent_ready / workspace_creation / workspace_ready) tracked in tasks/backlog/2026-04-19-trial-orchestrator-step-handler-coverage.md; those paths require mocks for drizzle + node-agent + project-data services and are out of scope for this PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(trial): remove hardcoded BASE_DOMAIN fallback + extract heartbeat skew constant

Addresses constitution-validator findings:

HIGH: bridge.ts:41 had `env.BASE_DOMAIN || 'workspaces.example.com'` fallback. BASE_DOMAIN is a non-optional binding; a misconfiguration that let it be empty would silently generate workspace URLs pointing at workspaces.example.com instead of failing loudly. Removed the fallback.

MEDIUM: steps.ts had a hardcoded `30_000` heartbeat-skew window. Extracted to DEFAULT_TRIAL_ORCHESTRATOR_HEARTBEAT_SKEW_MS (shared), TRIAL_ORCHESTRATOR_HEARTBEAT_SKEW_MS env override, getHeartbeatSkewMs() getter on the DO, threaded through TrialOrchestratorContext.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(trial): per-IP rate limit on POST /api/trial/create + SSE injection guard

Addresses security-auditor HIGH findings:

1. Rate limit on POST /api/trial/create (was missing)
- New rateLimitTrialCreate() factory (useIp=true, keyPrefix=trial-create)
- Default 10 req/hr, configurable via RATE_LIMIT_TRIAL_CREATE env var
- Tighter than the general anonymous bucket because each trial create allocates a Durable Object, fires ~4 GitHub API calls, and consumes a monthly trial slot
- Mounted per-route in create.ts so the limiter sees request env
- Regression test exercises 429 path with IP-scoped KV window

2. SSE event-name sanitization in formatSse()
- Strips CR/LF to prevent SSE-frame injection if a future caller ever bypasses the TrialEvent discriminated union via `as never` casts or dynamic event names
- Function now exported for direct testing
- New trial-events-format.test.ts covers: happy path stable shape, CR/LF strip on hostile event name (single event frame survives), and JSON data escaping for embedded newlines

* fix(trial): switch TrialOrchestrator to new_sqlite_classes + drop premature status gate

Addresses cloudflare-specialist HIGH findings:

1. wrangler.toml v9 migration: new_classes -> new_sqlite_classes
Cloudflare recommends SQLite-backed storage for new DO classes; the KV-style ctx.storage.put() API works identically on both backends but SQLite is the future-forward choice. TrialOrchestrator has not yet been deployed to any environment (introduced in this PR chain), so flipping the migration type is safe.

2. handleNodeProvisioning: remove synchronous status='running' gate
After provisionNode() returns, async-IP providers (Scaleway, GCP) leave the node in 'creating' status; the IP and status='running' flip happens on the first heartbeat. Synchronously requiring status='running' here forced every async-IP trial through the retry/backoff cycle until the heartbeat landed, wasting retry budget and risking permanent failure on slow VM boots. The next step (node_agent_ready) polls heartbeat freshness with its own timeout, which correctly handles both sync (Hetzner) and async (Scaleway/GCP) provisioning paths.

Regression test: handleNodeProvisioning advances to node_agent_ready even when provisionNode() leaves the node in 'creating' status.

* fix(trial): HMAC-verify fingerprint cookie before reusing UUID

Security-auditor HIGH: the old code extracted the fingerprint UUID from the `sam_trial_fingerprint` cookie by splitting on the last `.` without verifying the HMAC signature. An attacker who learned a victim's fingerprint UUID (from logs, a captured cookie, or a prior trial row) could forge `<victimUuid>.anything` to overwrite the `trial-by-fingerprint:<victimUuid>` KV index to point at their own trial. The victim's subsequent OAuth hook lookup would then redirect them to the attacker's trial project.

Fix: call verifyFingerprint(existingFp, secret) and only trust the returned UUID. Fall back to crypto.randomUUID() on invalid / missing signature. The secret is already resolved earlier in the same handler (line 195-203).

Added regression test in trial-create.ts.test.ts: a forged cookie MUST NOT reuse the victim's UUID; a fresh UUID is minted instead. Updated the "reuses existing fingerprint" test to use a validly-signed cookie.

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
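
The forged-cookie fix described in the last commit above can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the cookie layout (`<uuid>.<hex HMAC-SHA256>`), the exact `verifyFingerprint` signature, the `signFingerprint` and `resolveFingerprintUuid` helpers, and the use of `node:crypto` (a Worker would more likely use Web Crypto) are all hypothetical.

```typescript
import { createHmac, randomUUID, timingSafeEqual } from "node:crypto";

// Assumed cookie format: "<uuid>.<hex HMAC-SHA256 of the uuid>".
function signFingerprint(uuid: string, secret: string): string {
  const sig = createHmac("sha256", secret).update(uuid).digest("hex");
  return `${uuid}.${sig}`;
}

// Returns the UUID only when the signature matches; null otherwise.
function verifyFingerprint(cookie: string, secret: string): string | null {
  const dot = cookie.lastIndexOf(".");
  if (dot <= 0) return null;
  const uuid = cookie.slice(0, dot);
  const encoder = new TextEncoder();
  const given = encoder.encode(cookie.slice(dot + 1));
  const expected = encoder.encode(
    createHmac("sha256", secret).update(uuid).digest("hex"),
  );
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? uuid : null;
}

// Reuse the UUID from a validly signed cookie; otherwise mint a fresh one.
function resolveFingerprintUuid(
  cookie: string | undefined,
  secret: string,
): string {
  const verified = cookie ? verifyFingerprint(cookie, secret) : null;
  return verified ?? randomUUID();
}
```

With this shape, a cookie such as `<victimUuid>.anything` fails verification and gets a fresh UUID, so it can no longer overwrite the victim's `trial-by-fingerprint:` index entry.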

* task: move trial-onboarding-ux-polish to active

* feat(trial): polish discovery feed with skeleton timeline + knowledge grouping

- Extract all timing/threshold constants to trial-ui-config.ts (Constitution XI)
- Add STAGE_LABELS map + friendlyStageLabel() for orchestrator stage strings
- TryDiscovery: render StageSkeleton timeline before first SSE event arrives
- TryDiscovery: group rapid trial.knowledge events into a single card
- TryDiscovery: surface "taking longer than usual" hint when SSE silent for 20s
- TryDiscovery: retry-aware terminal error panel
- ChatGate: spinner + aria-busy on send, snap-x chip scroll, anonymous hint copy
- Try: friendlier validation copy, testid hooks for landing audit

* test(trial): cover stage-label mapping + skeleton/error/knowledge-burst Playwright cases

* task: archive trial-onboarding-ux-polish

* fix(trial): SSE replay dedup, accessible badges, larger touch targets

Addresses Phase 5 review findings on the trial onboarding UX polish PR:

CRITICAL: SSE event replay duplication
EventSource silently re-opens after a transport error and the server may replay any buffered events the client missed. Without dedup, the feed duplicated every replayed event. Add a composite (`type:at`) dedup set in TryDiscovery that resets on trialId change.

HIGH: color-only ConnectionBadge (WCAG 1.4.1)
Status was conveyed by background color alone. Prepend a Unicode shape indicator (●/✕/↺/○) so the meaning is also conveyed in monochrome.

HIGH: knowledge toggle hit area (WCAG 2.5.5)
The "+N more" toggle on grouped knowledge cards was 24px tall, below the 44px touch-target minimum. Promote to min-h-11 with vertical hit padding.

MEDIUM: semantic header role + truncation hint
The sticky discovery header used role="banner" (reserved for the page-wide masthead) and the truncated repo title had no full-text hover affordance. Switch to role="region" + aria-label and move the title attribute to the truncating wrapper.

LOW: error CTA touch targets
The "Try again" / "Join the waitlist" Links were below 44px. Promote to inline-flex min-h-[44px].

Tests
- try-discovery-dedup.test.ts: behavioural coverage of eventDedupKey and the dedup branch in onEvent (3 scenarios: identical replay, chronological non-collision, type-vs-timestamp collision).
- try-discovery-build-feed.test.ts: boundary coverage of buildFeed (within-window merge, exact-boundary `<=` merge, +1ms split, interleaved non-knowledge break, error-event exclusion).
- ChatGate.test.tsx: spinner visible/hidden behavioural test using a deferred promise (idle → sending → resolved transitions).
- trial-ui-audit.spec.ts: knowledge-burst test now asserts exactly one grouped card (was: presence only) and exercises the expand toggle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(trial): keep StageSkeleton visible after lone trial.started; forward Alert testid

Two narrow fixes uncovered by Playwright visual audit:

1. **StageSkeleton hides too eagerly.** `showSkeleton = events.length === 0` meant a lone `trial.started` event (which is just an acknowledgement, not visible progress) caused the "Setting things up" roadmap to vanish while the user was still staring at a blank screen. Tighten to "no substantive events yet": keep showing the roadmap until a real progress / knowledge / idea / ready / error event arrives.

2. **`Alert` drops `data-testid`.** The shared design-system `Alert` component didn't declare or forward `data-testid`, so `<Alert variant="error" data-testid="trial-error-panel">` silently discarded the prop and the terminal-error Playwright assertion couldn't find the panel. Add the prop to `AlertProps` and forward it to the rendered `<div role="alert">`.

All 45 Playwright trial-ui-audit tests now pass across iPhone SE, iPhone 14, and Desktop projects.

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
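
The composite `type:at` dedup from the CRITICAL finding above might look like the sketch below. The `TrialEvent` shape here is a reduced, hypothetical stand-in for the real union, and `createDedupingFeed` is an illustrative wrapper rather than the actual TryDiscovery code; only the `eventDedupKey` name comes from the commit message.

```typescript
// Hypothetical minimal event shape; the real TrialEvent union has more fields.
interface TrialEvent {
  type: string;
  at: number; // server timestamp in milliseconds
}

// Composite key: the same type at the same timestamp is treated as a replay.
function eventDedupKey(event: TrialEvent): string {
  return `${event.type}:${event.at}`;
}

// Illustrative feed holder. Call reset() when the trialId changes so keys
// from a previous trial cannot suppress events from the next one.
function createDedupingFeed() {
  const seen = new Set<string>();
  const feed: TrialEvent[] = [];
  return {
    feed,
    // Returns true if the event was appended, false if it was a replay.
    onEvent(event: TrialEvent): boolean {
      const key = eventDedupKey(event);
      if (seen.has(key)) return false;
      seen.add(key);
      feed.push(event);
      return true;
    },
    reset(): void {
      seen.clear();
      feed.length = 0;
    },
  };
}
```

Keying on both fields means two different event types that share a timestamp are both kept, which matches the "type-vs-timestamp collision" scenario listed in the tests.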

)

* task: move trial-events-debug to active

* task: instrument trial event bus path for staging triage

Add high-signal log.info points at every boundary in the trial event flow so `wrangler tail` can show exactly where the pipeline drops:
- create.ts: log dispatch_begin, orchestrator_task.{enter,stub_ready,start_returned}, knowledge_task.{enter,done}, waitUntil_registered
- trial-runner.ts:emitTrialEvent: log emit_begin / emit_ok
- trial-orchestrator: start.enter, state_put, alarm_set, trial_started_emitted; alarm.enter
- trial-event-bus: handleAppend.enter / stored / rejected_closed

Pure instrumentation, no behavior change. Will be pared back or removed once the failure mode is identified on staging.

* fix(trial): emit unnamed SSE frames so EventSource.onmessage fires

Root cause of the zero-events-on-staging incident (2026-04-19): formatSse() wrote named SSE frames ('event: trial.knowledge\ndata: {...}') but the frontend subscribes via source.onmessage, which only fires for the default (unnamed) event. Bytes arrived on the wire (curl saw them), but no frontend-visible event was ever dispatched.

Change the SSE serializer to emit unnamed frames ('data: {...}'). The TrialEvent payload itself carries a 'type' discriminator so no information is lost. Update the unit test to lock in the new contract (no 'event:' line) and point at the post-mortem.

Also fix a latent eventsUrl contract mismatch: POST /api/trial/create returned '/api/trial/events?trialId=X' while the real route is '/api/trial/:trialId/events'. The frontend builds its own URL so end-users weren't affected, but the response-field contract was wrong. The previous unit test used toContain() on a substring, masking the drift.

See docs/notes/2026-04-19-trial-sse-named-events-postmortem.md.

* test(trial): add TrialEventBus → SSE capability test

Regression guard for the 2026-04-19 incident. Seeds a trial in KV, appends events directly on the TrialEventBus DO (identical to emitTrialEvent()), opens the SSE stream via SELF.fetch with a valid fingerprint cookie, reads the raw stream bytes, and asserts:
- HTTP 200 + correct content-type
- At least one 'data: {...}' frame
- No 'event:' line anywhere (the regression guard)
- The parsed JSON payload round-trips through the bus intact

Also add TRIAL_EVENT_BUS DO binding and TRIAL_* env bindings to the workers vitest config so this test (and future trial-related worker tests) can construct stubs.

Note: the existing workers test pool is currently broken on this branch and base (miniflare WebSocket exits unexpectedly on all 6 pre-existing worker tests too; not caused by this change). Once the pool is unblocked this test runs as-is.

* docs(trial): post-mortem + rule 13 ban curl-only SSE verification

Post-mortem covers what broke, the two-layer contract mismatch (named SSE events + wrong eventsUrl shape), timeline, why it wasn't caught (no E2E capability test, curl used instead of a real browser, frontend test path not exercised), the class of bug, and the process fixes landing in this PR.

Update rule 13 (staging verification) to explicitly ban curl-only verification for browser-consumed SSE/WebSocket streams: curl confirms the byte stream, only a real browser confirms dispatch to onmessage.

* task: record root cause + fixes on trial SSE events task

* test(trial): update trial-events.test SSE assertion for unnamed frames

The integration test for GET /api/trial/:trialId/events was asserting the old named-event contract ('event: trial.ready'). With the formatSse() fix the frame is unnamed; update the assertion to lock in the new contract (data: line present, no event: line).

* task: archive trial SSE events debugging task

* chore(trial): address review findings on SSE events fix

- Add TRIAL_ORCHESTRATOR + TRIAL_COUNTER DO bindings to apps/api/vitest.workers.config.ts (cloudflare-specialist MEDIUM)
- CLAUDE.md: prepend 'trial-sse-events-fix' entry to Recent Changes (doc-sync-validator MEDIUM)
- Fix broken link in postmortem (tasks/active -> docs/notes) and tick the completed rule-13 follow-up checkbox (doc-sync-validator LOW)
- Add cross-reference from .claude/rules/02-quality-gates.md to the rule-13 curl-only SSE-verification ban (doc-sync-validator LOW)
- File pre-existing HIGH (AbortController not propagated into busStub.fetch) and MEDIUM (nextCursor persistence) as backlog tasks so they're tracked but don't block this fix PR

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
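
The unnamed-frame contract from the SSE fix above can be illustrated with a short sketch. The `formatSse` name comes from the commit messages, but its signature here, the reduced `TrialEvent` type, and the `parseSseFrames` helper are assumptions made for the example.

```typescript
// Hypothetical reduced event type; the real payload carries more fields.
interface TrialEvent {
  type: string;
  [key: string]: unknown;
}

// Serialize as an unnamed SSE frame: there is no "event:" line, so a browser
// dispatches it to EventSource.onmessage. The payload keeps its own `type`.
function formatSse(event: TrialEvent): string {
  // JSON.stringify escapes embedded newlines, so the data stays on one line.
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Illustrative parser for assertions on raw stream text (not a full SSE parser).
function parseSseFrames(
  raw: string,
): { event: string | undefined; data: string }[] {
  return raw
    .split("\n\n")
    .filter((frame) => frame.trim() !== "")
    .map((frame) => {
      let event: string | undefined;
      const data: string[] = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith("event:")) event = line.slice(6).trim();
        else if (line.startsWith("data:")) data.push(line.slice(5).trimStart());
      }
      return { event, data: data.join("\n") };
    });
}
```

A regression assertion in this style checks that every parsed frame has `event === undefined` and that `JSON.parse(frame.data).type` still identifies the event, which is what lets the frontend keep a single `onmessage` handler.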

…764)

* task: move trial orchestrator agent-boot task to active

* feat(trial): boot discovery agent on VM + detect real default branch

Two bugs blocked the trial demo from working end-to-end:

1. handleDiscoveryAgentStart only created chat + ACP session records but never called createAgentSessionOnNode / startAgentSessionOnNode. The ACP session sat in `pending` forever, never transitioning to `running`, so `trial.ready` never fired.
2. Project defaultBranch + workspace branch were hardcoded to 'main', so trials on master-default repos (e.g. octocat/Hello-World) failed the VM-side `git clone --branch main`.

Fix (mirrors TaskRunner's agent-session-step pattern):
- Add `defaultBranch`, `mcpToken`, `agentSessionCreatedOnVm`, `agentStartedOnVm`, `acpAssignedOnVm`, `acpRunningOnVm` fields to TrialOrchestratorState for crash-safe idempotency.
- `fetchDefaultBranch()` probes GitHub's public API with a 5s AbortController timeout (TRIAL_GITHUB_TIMEOUT_MS override), falls back to 'main' on any failure. Threaded through both `projects.default_branch` and the workspace-side `git clone --branch`.
- `handleDiscoveryAgentStart` now runs a 5-step idempotent flow:
  1. startDiscoveryAgent (existing) -> chat + ACP session records.
  2. createAgentSessionOnNode (new) -> D1 agent_sessions row + VM agent registers the session.
  3. generateMcpToken + storeMcpToken (new) -> KV token so the agent can call add_knowledge / create_idea.
  4. startAgentSessionOnNode (new) -> VM agent boots the agent subprocess with the discovery prompt + MCP server URL.
  5. transitionAcpSession pending -> assigned -> running -> the trial bridge emits `trial.ready` with workspaceUrl.
- Trial's synthetic taskId = state.trialId (trials have no tasks row), so MCP rate-limiting keys per-trial. Drop get_instructions from the initial prompt since it'd 404 against the tasks table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(trial): capability coverage for orchestrator VM agent boot

Adds trial-orchestrator-agent-boot.test.ts asserting the 3-step VM boot pattern + ACP pending→assigned→running transitions + idempotency across crash/retry. Updates trial-orchestrator-steps.test.ts for the new nodeId requirement and adds mocks for node-agent/mcp-token/project-data services. Also adds fetchDefaultBranch coverage (master, 404 fallback, network error fallback, idempotent re-entry).

Post-mortem at docs/notes/2026-04-19-trial-orchestrator-agent-boot-postmortem.md.

Process fix: adds port-of-pattern coverage bullet to .claude/rules/10-e2e-verification.md so a port of TaskRunner's agent-session pattern into a new consumer must assert every step fired.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* task: archive trial orchestrator agent-boot task

* docs(trial): add CLAUDE.md Recent Changes + TRIAL_GITHUB_TIMEOUT_MS row

* fix(trial): persist defaultBranch before D1 insert + redact mcpToken in getStatus

Cloudflare-specialist review (HIGH): two fixes

1. handleProjectCreation now persists state.defaultBranch before the D1 projects insert. Previously a crash between the D1 write and the DO state persist could cause a retry to re-probe GitHub and resolve a different branch than what had already landed in the projects row.
2. getStatus() now redacts the live mcpToken bearer credential before returning state to any debug/admin caller. The stale comment claiming the DO doesn't store secrets is corrected.

* fix(trial): revoke MCP token on failure + redaction test + review doc sync

Addresses Phase 5 reviewer findings from the trial-agent-boot PR:

security-auditor HIGH:
- Revoke state.mcpToken in failTrial() before emitting trial.error. Mirrors TaskRunner's state-machine.ts:265-275 pattern; closes the 4-hour TTL window where a leaked/botched-trial bearer token stays usable.
- Document the intentional non-revocation in handleRunning(): orchestrator terminates but the discovery agent still needs the token for MCP calls during the 20-min workspace TTL.
- Document the sentinel userId scoping limitation on resolveAnonymousUserId so future trial code remembers that per-user queries do NOT isolate trials from each other; projectId/trialId scoping is mandatory.

task-completion-validator MEDIUM:
- New test coverage for getStatus() mcpToken redaction (both populated and uninitialized state branches).
- New test coverage for failTrial revocation (happy path + KV-error tolerance).

doc-sync-validator HIGH:
- Add Trial Onboarding section to .claude/skills/env-reference/SKILL.md cross-referencing docs/guides/trial-configuration.md for the full table.

* fix(trial): allow multiple trials per repo (partial unique index)

The `(user_id, installation_id, repository)` unique index on `projects` prevented more than one anonymous trial per public repo: every trial after the first on the same repo hit a UNIQUE constraint failure during the projects insert in TrialOrchestrator.handleProjectCreation. The DO retried 6 times on alarm backoff then emitted a terminal `trial.error` ("step_failed"), so the user saw the 10% progress event repeat and then fail.

Why it slipped through earlier reviews: the capability tests mock D1, so no test exercised the real constraint. Staging verification only tested a single trial per repo. This surfaced the moment a second trial on `octocat/Hello-World` landed during Phase 6 verification.

Fix:
- Migration 0046 drops + recreates the index as a partial unique index that excludes the trial-sentinel user `system_anonymous_trials`. Real users still can't register duplicate project rows; sentinel-owned trial rows are isolated by `projectId` (per helpers.ts sentinel scope note).
- Drizzle schema updated with matching `.where()` clause so codegen and migration stay in sync.

Verified locally: trial-orchestrator tests pass (28/28); typecheck clean; lint clean (no new warnings).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
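
The crash-safe idempotency that the per-step flags give `handleDiscoveryAgentStart` can be sketched as below. This covers only steps 2 to 5 of the flow described above and makes several simplifying assumptions: the step functions are synchronous stand-ins for the real async service calls, `BootState` is a hypothetical subset of `TrialOrchestratorState`, and persistence to Durable Object storage after each step is omitted.

```typescript
// Hypothetical subset of TrialOrchestratorState: one marker per finished step.
interface BootState {
  agentSessionCreatedOnVm: boolean;
  mcpToken: string | null;
  agentStartedOnVm: boolean;
  acpAssignedOnVm: boolean;
  acpRunningOnVm: boolean;
}

// Stand-ins for the real (async) service calls; synchronous here for brevity.
interface BootSteps {
  createAgentSessionOnNode(): void;
  generateMcpToken(): string;
  startAgentSessionOnNode(token: string): void;
  transitionAcpSession(to: "assigned" | "running"): void;
}

// Each step runs only if its marker is unset and sets the marker right after
// it succeeds, so a retry after a crash resumes at the first unfinished step.
function bootDiscoveryAgent(state: BootState, steps: BootSteps): void {
  if (!state.agentSessionCreatedOnVm) {
    steps.createAgentSessionOnNode();
    state.agentSessionCreatedOnVm = true;
  }
  // Reuse an existing token on retry instead of minting a second one.
  const token = state.mcpToken ?? steps.generateMcpToken();
  state.mcpToken = token;
  if (!state.agentStartedOnVm) {
    steps.startAgentSessionOnNode(token);
    state.agentStartedOnVm = true;
  }
  if (!state.acpAssignedOnVm) {
    steps.transitionAcpSession("assigned");
    state.acpAssignedOnVm = true;
  }
  if (!state.acpRunningOnVm) {
    steps.transitionAcpSession("running");
    state.acpRunningOnVm = true;
  }
}
```

If `startAgentSessionOnNode` throws on the first attempt, the next alarm-driven retry skips session creation and token generation and goes straight back to the failed step.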

trial.ready is a provisioning milestone (workspace is up), not a signal that discovery is complete. The discovery agent continues producing trial.knowledge and trial.idea events after the workspace is provisioned.

Changes:
- Event bus: only auto-close on trial.error, not trial.ready
- Frontend: keep EventSource open after trial.ready with a 3-minute grace timer (TRIAL_DISCOVERY_STREAM_TIMEOUT_MS) for late-arriving discovery events
- Header shows "Discovering <repo>…" while stream is still open after trial.ready, then "Ready: <repo>" after stream closes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…icons

- Add TrialAgentActivityEvent type and bridgeAgentActivity() to pipe agent messages/tool calls into the trial SSE stream
- Hook message persistence path to emit trial.agent_activity events
- Render agent activity cards in the feed (grouped, showing tool names)
- Replace misleading "Workspace ready, chat below" with informative message about agent analyzing repository
- Replace emoji icons (📎, ★) with lucide-react icons (BookOpen, Lightbulb, Brain, Wrench, Terminal) matching platform design
- Add auto-scroll to bottom on new events (scrollIntoView smooth)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Deduplicate consecutive progress events with the same stage in the feed: the orchestrator re-emits keepalive progress while waiting for the agent, creating visual spam (3x "Starting the agent" at 70%)
- Clean up agent activity text: strip XML tags, collapse JSON blobs, add line-clamp-2 for overflow
- Change "AGENT WORKING..." from uppercase to normal case
- Add cleanActivityText() helper for readable tool output summaries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
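
A rough sketch of the two feed cleanups listed above follows. The `FeedEvent` shape and the `dedupeConsecutiveProgress` name are hypothetical, and the regular expressions in `cleanActivityText` are guesses at reasonable rules rather than the shipped implementation.

```typescript
// Hypothetical reduced feed event; only progress events carry a stage.
interface FeedEvent {
  type: string;
  stage?: string;
}

// Drop a progress event when the previously kept event is a progress event
// with the same stage (keepalive re-emits); everything else passes through.
function dedupeConsecutiveProgress<T extends FeedEvent>(events: T[]): T[] {
  const out: T[] = [];
  for (const event of events) {
    const prev = out[out.length - 1];
    const isRepeat =
      prev !== undefined &&
      event.type === "trial.progress" &&
      prev.type === "trial.progress" &&
      prev.stage === event.stage;
    if (!isRepeat) out.push(event);
  }
  return out;
}

// Assumed cleanup rules: strip XML-style tags, collapse flat JSON-looking
// blobs to a placeholder, then squeeze runs of whitespace.
function cleanActivityText(text: string): string {
  return text
    .replace(/<[^>]+>/g, " ")
    .replace(/\{[^{}]*\}/g, "{...}")
    .replace(/\s+/g, " ")
    .trim();
}
```

Only consecutive repeats are dropped, so a stage that recurs after some other event still shows up again in the feed.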

…g-mvp

…into sam/trial-onboarding-mvp

Merge sam/trial-discovery-stream-fix into trial MVP branch, bringing:
- Auto-scroll to bottom on new events
- Agent activity cards grouped in feed with Lucide icons
- Progress card deduplication and text cleanup
- Stream stays open after trial.ready (agent continues producing events)
- Default model switched to Qwen 3 30B

Update trial-event-bus test to match new behavior: trial.ready no longer closes the bus since the discovery agent continues producing knowledge and idea events after workspace provisioning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add AI usage section to the admin analytics dashboard, powered by the AI Gateway Logs API. Shows token usage, estimated cost, trial vs. authenticated breakdown, per-model metrics, and daily trends.

Backend:
- New admin endpoint GET /api/admin/analytics/ai-usage?period=7d queries AI Gateway logs with pagination and aggregates by model/day
- AI proxy now tags requests with projectId and trialId in cf-aig-metadata for trial usage attribution
- Configurable via AI_USAGE_PAGE_SIZE, AI_USAGE_MAX_PAGES env vars

Frontend:
- AIUsageChart component with KPI cards, stacked bar chart (tokens by model), daily usage area chart, and model breakdown table
- Integrated into admin analytics dashboard above DAU chart
- Graceful fallback if AI Gateway is not configured (catch + null)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…stics The CF AI Gateway Logs API uses `order_by_direction` (not `direction`) for sort order, and error responses now include the upstream body for easier debugging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The Cloudflare AI Gateway Logs API enforces a maximum per_page of 50. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(trial): address review findings from trial onboarding subagents

Security and correctness fixes from 7 specialist reviewers:

CRITICAL:
- Fix cookie domain mismatch: claim.ts clearClaimCookie and oauth-hook.ts buildClaimCookie now pass domain from BASE_DOMAIN (matching create.ts)

HIGH:
- TrialEventBus DO: persist `closed` flag to storage so it survives eviction
- AI proxy: sanitize error bodies: log raw errors server-side, return generic messages to clients (prevents internal URL/config leakage)
- Admin AI usage: sanitize CF API error responses the same way
- SSE events endpoint: add per-IP rate limiting (30 req/5min via KV)
- Deploy pipeline: forward ANTHROPIC_API_KEY_TRIAL as optional Worker secret
- sync-wrangler-config: inject ENVIRONMENT var into generated env sections
- Remove hardcoded DEFAULT_GATEWAY_ID; require AI_GATEWAY_ID from env

MEDIUM:
- Cron collision: move trial counter rollover from 03:00 to 05:00 UTC (avoids collision with daily analytics forward job at 03:00)
- Replace magic number in create.ts with DEFAULT_TRIAL_CLAIM_TTL_MS constant
- Add trial secrets to secrets-taxonomy.md and trial-configuration.md
- Add comprehensive trial + AI proxy env vars to .env.example
- Fix test mocks: add ctx.storage to TrialEventBus tests, add KV to SSE tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(trial): address CTO review (6 quality improvements)

1. Reject unknown IP: SSE rate limit now returns 400 when no client IP header is present, instead of sharing a single "unknown" bucket across all headerless clients. CF-Connecting-IP is always present on Workers.
2. Document KV rate limit trade-off: added inline comment explaining why KV's non-atomic read-modify-write is acceptable here (storm prevention, not exact enforcement) vs DO-based counters for credential rotation.
3. Clean up formatSse: removed unused _eventName parameter that gave the false impression the event name was being used. Updated all call sites and tests.
4. Cookie domain consistency test: new regression test suite asserting that buildClaimCookie, clearClaimCookie, and buildFingerprintCookie produce matching Domain= attributes. Explicitly demonstrates the bug where clearing without a domain fails to delete a domain-scoped cookie.
5. AI_GATEWAY_ID self-hoster safe: returns an empty summary (zero counts) when AI_GATEWAY_ID is not configured, instead of throwing. Self-hosters who don't use AI Gateway get a clean "no data" admin dashboard.
6. Fix .env.example cron default: TRIAL_CRON_ROLLOVER_CRON now shows "0 5 1 * *" matching the actual default after the collision fix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
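
The cookie-domain mismatch fixed above comes down to the set and clear paths emitting the same `Domain=` attribute, since a browser only replaces or deletes a cookie whose name, domain, and path all match. A minimal sketch follows; the cookie name `sam_trial_claim`, the attribute set, and the function signatures are assumptions for illustration, with only the `buildClaimCookie` and `clearClaimCookie` names taken from the commit messages.

```typescript
// Hypothetical cookie name; the real one is not shown in this PR text.
const CLAIM_COOKIE = "sam_trial_claim";

// Set path: scope the cookie to the base domain.
function buildClaimCookie(
  token: string,
  baseDomain: string,
  maxAgeSec: number,
): string {
  return `${CLAIM_COOKIE}=${token}; Domain=${baseDomain}; Path=/; Max-Age=${maxAgeSec}; HttpOnly; Secure; SameSite=Lax`;
}

// Clear path: same name, Domain, and Path, with Max-Age=0 to expire it.
function clearClaimCookie(baseDomain: string): string {
  return `${CLAIM_COOKIE}=; Domain=${baseDomain}; Path=/; Max-Age=0; HttpOnly; Secure; SameSite=Lax`;
}

// Helper for consistency assertions: extract the Domain attribute, if any.
function domainAttr(setCookie: string): string | undefined {
  const match = /;\s*Domain=([^;]+)/i.exec(setCookie);
  return match?.[1];
}
```

A consistency test in this style compares `domainAttr()` across every builder, so a clear path that forgets the domain shows up as `undefined` against the expected base domain.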

Resolves package.json version conflict (take main's newer deps) and fixes simple-import-sort/exports error in packages/shared/src/constants/index.ts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Autofix export sort in apps/web/src/lib/api/index.ts
- Move useMemo before early return in AIUsageChart (rules-of-hooks)
- Prefix unused anthropicModels with _ in staging test
- Add FILE SIZE EXCEPTION comments for TryDiscovery.tsx and steps.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sonarqubecloud · 2026-04-21T04:53:27Z

Quality Gate failed

Failed conditions
6 Security Hotspots
C Reliability Rating on New Code (required ≥ A)

Covers the trial onboarding MVP (PR #758), AI proxy Anthropic routing, Codex scope validation backfire (PR #772), and the seven-reviewer cleanup (PR #770). Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

raphaeltm and others added 9 commits April 18, 2026 20:17

merge: wave-1 track-a backend lifecycle into trial onboarding integration

4b87b6b

merge: wave-1 track-b backend claim + SSE streaming into trial onboarding integration

ac6c7b9

merge: wave-1 track-d chat gate + login sheet into trial onboarding integration

a2342ad

simple-agent-manager Bot added the needs-human-review label ("Agent could not complete all review gates; human must approve before merge") Apr 18, 2026

simple-agent-manager Bot temporarily deployed to staging April 18, 2026 22:00 Inactive

simple-agent-manager Bot had a problem deploying to staging April 18, 2026 22:00 Failure

fix(deploy): remove invalid --remote flag from wrangler kv key put

b15ca27

`wrangler kv key put` writes to remote by default; --remote is not a valid flag for that subcommand and caused the staging deploy's trial kill-switch step to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simple-agent-manager Bot temporarily deployed to staging April 18, 2026 22:08 Inactive

simple-agent-manager Bot temporarily deployed to staging April 18, 2026 22:13 Inactive

simple-agent-manager Bot temporarily deployed to staging April 18, 2026 22:22 Inactive

simple-agent-manager Bot temporarily deployed to staging April 18, 2026 22:26 Inactive

simple-agent-manager Bot and others added 2 commits April 19, 2026 10:41

simple-agent-manager Bot mentioned this pull request Apr 19, 2026

feat(web): support deeply nested chat sessions with context anchors #759

Merged

29 tasks

simple-agent-manager Bot and others added 5 commits April 19, 2026 15:58

Merge branch 'sam/ai-proxy-anthropic-models' into sam/trial-onboarding-mvp

30f2373

simple-agent-manager Bot temporarily deployed to staging April 20, 2026 21:35 Inactive

simple-agent-manager Bot temporarily deployed to staging April 20, 2026 21:40 Inactive

raphaeltm and others added 2 commits April 21, 2026 01:17

Merge remote-tracking branch 'origin/sam/trial-discovery-stream-fix' into sam/trial-onboarding-mvp

cb40940

simple-agent-manager Bot temporarily deployed to staging April 21, 2026 01:21 Inactive

simple-agent-manager Bot temporarily deployed to staging April 21, 2026 01:25 Inactive

simple-agent-manager Bot temporarily deployed to staging April 21, 2026 02:00 Inactive

simple-agent-manager Bot temporarily deployed to staging April 21, 2026 02:05 Inactive

simple-agent-manager Bot temporarily deployed to staging April 21, 2026 02:10 Inactive

simple-agent-manager Bot temporarily deployed to staging April 21, 2026 02:14 Inactive

fix(api): reduce AI Gateway page size to CF max of 50

5c2e5cd

The Cloudflare AI Gateway Logs API enforces a maximum per_page of 50. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simple-agent-manager Bot temporarily deployed to staging April 21, 2026 02:17 Inactive

simple-agent-manager Bot temporarily deployed to staging April 21, 2026 02:22 Inactive

simple-agent-manager Bot and others added 5 commits April 21, 2026 04:25

Merge origin/main into sam/trial-onboarding-mvp + fix export sort lint

e3345fc

Resolves package.json version conflict (take main's newer deps) and fixes simple-import-sort/exports error in packages/shared/src/constants/index.ts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(lint): sort imports in workspaces/runtime.ts

f3e2716

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(lint): remove unused anthropicModels variable

ea9b4b1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simple-agent-manager Bot merged commit 1f92ecf into main Apr 21, 2026
16 of 19 checks passed

simple-agent-manager Bot mentioned this pull request Apr 21, 2026

docs(blog): SAM daily journal — try before you sign up #776

Merged

3 tasks

feat(trial): zero-friction URL-to-workspace onboarding MVP #758
simple-agent-manager[bot] merged 35 commits into main from sam/trial-onboarding-mvp

simple-agent-manager Bot commented Apr 18, 2026 (edited)

Summary

cc @raphaeltm — Configuration Checklist Before Merge

Staging (sammy.party) — zero manual steps required

Production (simple-agent-manager.org) — one manual step (the key)

Cookies

Kill Switches

What Shipped

Wave 0 — Foundation (e253c08e)

Wave 1 Track A — Backend Lifecycle (4ca29ea6)

Wave 1 Track B — Backend Claim + SSE (6ba2e101)

Wave 1 Track C — Frontend Discovery (e8088705)

Wave 1 Track D — Frontend Chat Gate (1114c8fc)

Wave 2 — Integration, Automation, and Live Fix

Non-negotiable Constraints Verified

Local Quality Gates

Staging Deployments

Staging Verification (Playwright + curl, live app)

Regression spot-check

What was NOT verified end-to-end

Review Status

Do NOT Merge Yet

simple-agent-manager Bot commented Apr 18, 2026

Staging verification update — trials automation + integration fix
